{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "# Fundamentals of machine learning"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "## Generalization: The goal of machine learning"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Underfitting and overfitting"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Noisy training data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Ambiguous features"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Rare features and spurious correlations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "**Adding white-noise channels or all-zeros channels to MNIST**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from tensorflow.keras.datasets import mnist\n",
    "import numpy as np\n",
    "\n",
    "(train_images, train_labels), _ = mnist.load_data()\n",
    "train_images = train_images.reshape((60000, 28 * 28))\n",
    "train_images = train_images.astype(\"float32\") / 255\n",
    "\n",
    "train_images_with_noise_channels = np.concatenate(\n",
    "    [train_images, np.random.random((len(train_images), 784))], axis=1)\n",
    "\n",
    "train_images_with_zeros_channels = np.concatenate(\n",
    "    [train_images, np.zeros((len(train_images), 784))], axis=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "**Training the same model on MNIST data with noise channels or all-zero channels**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from tensorflow import keras\n",
    "from tensorflow.keras import layers\n",
    "\n",
    "def get_model():\n",
    "    model = keras.Sequential([\n",
    "        layers.Dense(512, activation=\"relu\"),\n",
    "        layers.Dense(10, activation=\"softmax\")\n",
    "    ])\n",
    "    model.compile(optimizer=\"rmsprop\",\n",
    "                  loss=\"sparse_categorical_crossentropy\",\n",
    "                  metrics=[\"accuracy\"])\n",
    "    return model\n",
    "\n",
    "model = get_model()\n",
    "history_noise = model.fit(\n",
    "    train_images_with_noise_channels, train_labels,\n",
    "    epochs=10,\n",
    "    batch_size=128,\n",
    "    validation_split=0.2)\n",
    "\n",
    "model = get_model()\n",
    "history_zeros = model.fit(\n",
    "    train_images_with_zeros_channels, train_labels,\n",
    "    epochs=10,\n",
    "    batch_size=128,\n",
    "    validation_split=0.2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "**Plotting a validation accuracy comparison**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "val_acc_noise = history_noise.history[\"val_accuracy\"]\n",
    "val_acc_zeros = history_zeros.history[\"val_accuracy\"]\n",
    "epochs = range(1, 11)\n",
    "plt.plot(epochs, val_acc_noise, \"b-\",\n",
    "         label=\"Validation accuracy with noise channels\")\n",
    "plt.plot(epochs, val_acc_zeros, \"b--\",\n",
    "         label=\"Validation accuracy with zeros channels\")\n",
    "plt.title(\"Effect of noise channels on validation accuracy\")\n",
    "plt.xlabel(\"Epochs\")\n",
    "plt.ylabel(\"Accuracy\")\n",
    "plt.legend()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### The nature of generalization in deep learning"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "**Fitting a MNIST model with randomly shuffled labels**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "(train_images, train_labels), _ = mnist.load_data()\n",
    "train_images = train_images.reshape((60000, 28 * 28))\n",
    "train_images = train_images.astype(\"float32\") / 255\n",
    "\n",
    "random_train_labels = train_labels[:]\n",
    "np.random.shuffle(random_train_labels)\n",
    "\n",
    "model = keras.Sequential([\n",
    "    layers.Dense(512, activation=\"relu\"),\n",
    "    layers.Dense(10, activation=\"softmax\")\n",
    "])\n",
    "model.compile(optimizer=\"rmsprop\",\n",
    "              loss=\"sparse_categorical_crossentropy\",\n",
    "              metrics=[\"accuracy\"])\n",
    "model.fit(train_images, random_train_labels,\n",
    "          epochs=100,\n",
    "          batch_size=128,\n",
    "          validation_split=0.2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### The manifold hypothesis"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Interpolation as a source of generalization"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Why deep learning works"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Training data is paramount"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "## Evaluating machine-learning models"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Training, validation, and test sets"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Simple hold-out validation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### K-fold validation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Iterated K-fold validation with shuffling"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Beating a common-sense baseline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Things to keep in mind about model evaluation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "## Improving model fit"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Tuning key gradient descent parameters"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "**Training a MNIST model with an incorrectly high learning rate**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "(train_images, train_labels), _ = mnist.load_data()\n",
    "train_images = train_images.reshape((60000, 28 * 28))\n",
    "train_images = train_images.astype(\"float32\") / 255\n",
    "\n",
    "model = keras.Sequential([\n",
    "    layers.Dense(512, activation=\"relu\"),\n",
    "    layers.Dense(10, activation=\"softmax\")\n",
    "])\n",
    "model.compile(optimizer=keras.optimizers.RMSprop(1.),\n",
    "              loss=\"sparse_categorical_crossentropy\",\n",
    "              metrics=[\"accuracy\"])\n",
    "model.fit(train_images, train_labels,\n",
    "          epochs=10,\n",
    "          batch_size=128,\n",
    "          validation_split=0.2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "**The same model with a more appropriate learning rate**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential([\n",
    "    layers.Dense(512, activation=\"relu\"),\n",
    "    layers.Dense(10, activation=\"softmax\")\n",
    "])\n",
    "model.compile(optimizer=keras.optimizers.RMSprop(1e-2),\n",
    "              loss=\"sparse_categorical_crossentropy\",\n",
    "              metrics=[\"accuracy\"])\n",
    "model.fit(train_images, train_labels,\n",
    "          epochs=10,\n",
    "          batch_size=128,\n",
    "          validation_split=0.2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Leveraging better architecture priors"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Increasing model capacity"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "**A simple logistic regression on MNIST**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential([layers.Dense(10, activation=\"softmax\")])\n",
    "model.compile(optimizer=\"rmsprop\",\n",
    "              loss=\"sparse_categorical_crossentropy\",\n",
    "              metrics=[\"accuracy\"])\n",
    "history_small_model = model.fit(\n",
    "    train_images, train_labels,\n",
    "    epochs=20,\n",
    "    batch_size=128,\n",
    "    validation_split=0.2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "val_loss = history_small_model.history[\"val_loss\"]\n",
    "epochs = range(1, 21)\n",
    "plt.plot(epochs, val_loss, \"b--\",\n",
    "         label=\"Validation loss\")\n",
    "plt.title(\"Effect of insufficient model capacity on validation loss\")\n",
    "plt.xlabel(\"Epochs\")\n",
    "plt.ylabel(\"Loss\")\n",
    "plt.legend()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential([\n",
    "    layers.Dense(96, activation=\"relu\"),\n",
    "    layers.Dense(96, activation=\"relu\"),\n",
    "    layers.Dense(10, activation=\"softmax\"),\n",
    "])\n",
    "model.compile(optimizer=\"rmsprop\",\n",
    "              loss=\"sparse_categorical_crossentropy\",\n",
    "              metrics=[\"accuracy\"])\n",
    "history_large_model = model.fit(\n",
    "    train_images, train_labels,\n",
    "    epochs=20,\n",
    "    batch_size=128,\n",
    "    validation_split=0.2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "## Improving generalization"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Dataset curation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Feature engineering"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Using early stopping"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Regularizing your model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Reducing the network's size"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "**Original model**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from tensorflow.keras.datasets import imdb\n",
    "(train_data, train_labels), _ = imdb.load_data(num_words=10000)\n",
    "\n",
    "def vectorize_sequences(sequences, dimension=10000):\n",
    "    results = np.zeros((len(sequences), dimension))\n",
    "    for i, sequence in enumerate(sequences):\n",
    "        results[i, sequence] = 1.\n",
    "    return results\n",
    "train_data = vectorize_sequences(train_data)\n",
    "\n",
    "model = keras.Sequential([\n",
    "    layers.Dense(16, activation=\"relu\"),\n",
    "    layers.Dense(16, activation=\"relu\"),\n",
    "    layers.Dense(1, activation=\"sigmoid\")\n",
    "])\n",
    "model.compile(optimizer=\"rmsprop\",\n",
    "              loss=\"binary_crossentropy\",\n",
    "              metrics=[\"accuracy\"])\n",
    "history_original = model.fit(train_data, train_labels,\n",
    "                             epochs=20, batch_size=512, validation_split=0.4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "**Version of the model with lower capacity**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential([\n",
    "    layers.Dense(4, activation=\"relu\"),\n",
    "    layers.Dense(4, activation=\"relu\"),\n",
    "    layers.Dense(1, activation=\"sigmoid\")\n",
    "])\n",
    "model.compile(optimizer=\"rmsprop\",\n",
    "              loss=\"binary_crossentropy\",\n",
    "              metrics=[\"accuracy\"])\n",
    "history_smaller_model = model.fit(\n",
    "    train_data, train_labels,\n",
    "    epochs=20, batch_size=512, validation_split=0.4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "**Version of the model with higher capacity**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential([\n",
    "    layers.Dense(512, activation=\"relu\"),\n",
    "    layers.Dense(512, activation=\"relu\"),\n",
    "    layers.Dense(1, activation=\"sigmoid\")\n",
    "])\n",
    "model.compile(optimizer=\"rmsprop\",\n",
    "              loss=\"binary_crossentropy\",\n",
    "              metrics=[\"accuracy\"])\n",
    "history_larger_model = model.fit(\n",
    "    train_data, train_labels,\n",
    "    epochs=20, batch_size=512, validation_split=0.4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Adding weight regularization"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "**Adding L2 weight regularization to the model**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from tensorflow.keras import regularizers\n",
    "model = keras.Sequential([\n",
    "    layers.Dense(16,\n",
    "                 kernel_regularizer=regularizers.l2(0.002),\n",
    "                 activation=\"relu\"),\n",
    "    layers.Dense(16,\n",
    "                 kernel_regularizer=regularizers.l2(0.002),\n",
    "                 activation=\"relu\"),\n",
    "    layers.Dense(1, activation=\"sigmoid\")\n",
    "])\n",
    "model.compile(optimizer=\"rmsprop\",\n",
    "              loss=\"binary_crossentropy\",\n",
    "              metrics=[\"accuracy\"])\n",
    "history_l2_reg = model.fit(\n",
    "    train_data, train_labels,\n",
    "    epochs=20, batch_size=512, validation_split=0.4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "**Different weight regularizers available in Keras**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from tensorflow.keras import regularizers\n",
    "regularizers.l1(0.001)\n",
    "regularizers.l1_l2(l1=0.001, l2=0.001)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Adding dropout"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "**Adding dropout to the IMDB model**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential([\n",
    "    layers.Dense(16, activation=\"relu\"),\n",
    "    layers.Dropout(0.5),\n",
    "    layers.Dense(16, activation=\"relu\"),\n",
    "    layers.Dropout(0.5),\n",
    "    layers.Dense(1, activation=\"sigmoid\")\n",
    "])\n",
    "model.compile(optimizer=\"rmsprop\",\n",
    "              loss=\"binary_crossentropy\",\n",
    "              metrics=[\"accuracy\"])\n",
    "history_dropout = model.fit(\n",
    "    train_data, train_labels,\n",
    "    epochs=20, batch_size=512, validation_split=0.4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "## Summary"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [],
   "name": "chapter05_fundamentals-of-ml.i",
   "private_outputs": false,
   "provenance": [],
   "toc_visible": true
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}